Some packages we will use:
library(tidyverse)
library(leaflet)
library(sf)
library(terra)
library(tidyterra)
library(ggmap)
library(DHARMa)
We will look at some records of Koalas I got from the Atlas of Living Australia. I picked a somewhat arbitrary radius including Armidale.
koalas <- read_csv('maps_workshop_material/records-2023-06-20.csv')
## Rows: 650 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): dataResourceUid, geodeticDatum, verbatimCoordinateSystem, species,...
## dbl (4): decimalLatitude, decimalLongitude, verbatimLatitude, verbatimLongi...
## dttm (1): eventDate
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
ggplot(koalas,aes(x=decimalLongitude,y=decimalLatitude))+
geom_point()
Just points are a little boring/unhelpful. One quick/easy option is leaflet:
m <- leaflet() %>% setView(lng = mean(koalas$decimalLongitude),
lat = mean(koalas$decimalLatitude),
zoom=12)
m %>% addTiles() %>% addMarkers(data = koalas,
~decimalLongitude,
~decimalLatitude,
#clusterOptions = markerClusterOptions()
)
Note we can do the cluster thing if we want – it is quite helpful.
Interactive maps are cool but we maybe want something more static for
papers etc. ggmap can be a bit annoying but is one way:
a_box <- c(left = min(koalas$decimalLongitude) - .01,
right = max(koalas$decimalLongitude) + .01,
bottom = min(koalas$decimalLatitude) - .01,
top = max(koalas$decimalLatitude) + .01)
aMap <- get_stamenmap(a_box,zoom=11,maptype = "terrain")
## ℹ Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL.
ggmap(aMap) +
geom_point(data = koalas,
aes(x=decimalLongitude,
y=decimalLatitude))
#because of the API its a bit clunky
Perhaps we would like some science. When working with maps there are
two main data types you want to know: raster data and vector data.
Vector data are shapes – lines, polygons, etc., whereas rasters are
things that have pixels. You deal with vectors using sf and
rasters using terra (previously you used sp
and raster)
I’ve given an example of each: some land cover data in a raster, and some ABS boundaries which are vector polygons.
landCover <- rast('./maps_workshop_material/ga_ls_landcover_class_cyear_2_1-0-0_au_x18y-35_2020-01-01_level4_rgb.tif')
landCover
## class : SpatRaster
## dimensions : 4000, 4000, 3 (nrow, ncol, nlyr)
## resolution : 25, 25 (x, y)
## extent : 1800000, 1900000, -3500000, -3400000 (xmin, xmax, ymin, ymax)
## coord. ref. : GDA94 / Australian Albers (EPSG:3577)
## source : ga_ls_landcover_class_cyear_2_1-0-0_au_x18y-35_2020-01-01_level4_rgb.tif
## colors RGB : 1, 2, 3
## names : Red, Green, Blue
abs_poly <- st_read('./maps_workshop_material/SA1_2021_subset.shp')
## Reading layer `SA1_2021_subset' from data source
## `C:\Users\rcope4\OneDrive - University of New England\Documents\teaching\misc workshops etc\maps\maps_workshop_material\SA1_2021_subset.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 512 features and 17 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 148.6762 ymin: -31.85823 xmax: 152.6314 ymax: -28.24899
## Geodetic CRS: GCS_GDA2020
abs_poly
## Simple feature collection with 512 features and 17 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 148.6762 ymin: -31.85823 xmax: 152.6314 ymax: -28.24899
## Geodetic CRS: GCS_GDA2020
## First 10 features:
## SA1_CODE21 CHG_FLAG21 CHG_LBL21 SA2_CODE21 SA2_NAME21 SA3_CODE21 SA3_NAME21
## 1 11001118601 0 No change 110011186 Armidale 11001 Armidale
## 2 11001118602 0 No change 110011186 Armidale 11001 Armidale
## 3 11001118603 0 No change 110011186 Armidale 11001 Armidale
## 4 11001118605 0 No change 110011186 Armidale 11001 Armidale
## 5 11001118606 0 No change 110011186 Armidale 11001 Armidale
## 6 11001118607 0 No change 110011186 Armidale 11001 Armidale
## 7 11001118608 0 No change 110011186 Armidale 11001 Armidale
## 8 11001118609 0 No change 110011186 Armidale 11001 Armidale
## 9 11001118611 0 No change 110011186 Armidale 11001 Armidale
## 10 11001118612 0 No change 110011186 Armidale 11001 Armidale
## SA4_CODE21 SA4_NAME21 GCC_CODE21 GCC_NAME21 STE_CODE21
## 1 110 New England and North West 1RNSW Rest of NSW 1
## 2 110 New England and North West 1RNSW Rest of NSW 1
## 3 110 New England and North West 1RNSW Rest of NSW 1
## 4 110 New England and North West 1RNSW Rest of NSW 1
## 5 110 New England and North West 1RNSW Rest of NSW 1
## 6 110 New England and North West 1RNSW Rest of NSW 1
## 7 110 New England and North West 1RNSW Rest of NSW 1
## 8 110 New England and North West 1RNSW Rest of NSW 1
## 9 110 New England and North West 1RNSW Rest of NSW 1
## 10 110 New England and North West 1RNSW Rest of NSW 1
## STE_NAME21 AUS_CODE21 AUS_NAME21 AREASQKM21
## 1 New South Wales AUS Australia 1.8617
## 2 New South Wales AUS Australia 34.3521
## 3 New South Wales AUS Australia 39.1836
## 4 New South Wales AUS Australia 46.2758
## 5 New South Wales AUS Australia 39.5515
## 6 New South Wales AUS Australia 0.6526
## 7 New South Wales AUS Australia 0.2007
## 8 New South Wales AUS Australia 0.2262
## 9 New South Wales AUS Australia 34.9480
## 10 New South Wales AUS Australia 0.2459
## LOCI_URI21
## 1 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118601
## 2 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118602
## 3 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118603
## 4 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118605
## 5 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118606
## 6 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118607
## 7 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118608
## 8 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118609
## 9 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118611
## 10 http://linked.data.gov.au/dataset/asgsed3/SA1/11001118612
## geometry
## 1 POLYGON ((151.6426 -30.5008...
## 2 POLYGON ((151.7053 -30.5413...
## 3 POLYGON ((151.6969 -30.4698...
## 4 POLYGON ((151.5733 -30.4850...
## 5 POLYGON ((151.5869 -30.5545...
## 6 POLYGON ((151.6433 -30.4989...
## 7 POLYGON ((151.6744 -30.4954...
## 8 POLYGON ((151.6707 -30.4977...
## 9 POLYGON ((151.7049 -30.5277...
## 10 POLYGON ((151.6876 -30.5070...
Note the ABS polygons are a subset – the data I obtained from the website was every polygon in Australia and quite large.
We can plot them:
ggplot()+geom_spatraster_rgb(data = landCover)
## SpatRaster resampled to ncells = 501264
Note we needed tidyterra for the geom above
ggplot(abs_poly)+geom_sf()
The raster is a bit annoying to work with / extract information from (though there is a lot of information here – 4000x4000 25m land-cover measurements). The sf is a lot easier to deal with, its kinda just a data frame but with one column being the polygons. This means that you can use the standard R tools you use for data wrangling.
The polygons themselves are a little boring; The main reason I’d use ABS polygons is to use ABS data (you have some)
abs_data <- read_csv('maps_workshop_material/2021_SA1_NSW_subset.csv')
## Rows: 512 Columns: 281
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (281): SA1_CODE_2021, F_GrDi_GrCer_GrDi_Mng, F_GrDi_GrCer_GrDi_Pro, F_Gr...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
abs_data <- abs_data %>% select(SA1_CODE_2021,
PhDs = P_PoDe_DoctDe_Tot,
pop = P_Tot_Tot)
To combine these / make the map, we need to ensure they share a column that has the same values to match on (particularly here, the same type of values). So we do:
abs_poly %>% filter(SA2_NAME21 == 'Armidale') %>%
mutate(SA1_CODE_2021 = as.numeric(SA1_CODE21)) %>%
left_join(abs_data) %>%
ggplot(aes(fill = PhDs/pop)) + geom_sf() +
geom_point(data = koalas,
inherit.aes = FALSE,
col='red',
aes(x=decimalLongitude,
y=decimalLatitude))
## Joining with `by = join_by(SA1_CODE_2021)`
An annoying thing you will often encounter is coordinate references systems (CRS). Because the world is a sphere, when you make a map you need to deform that sphere in some way (to make it flat) which is described by the CRS. Both the objects we imported tell you their CRS, I think epsg.io is helpful here, like epsg.io/3577
You see the problem if you try:
ggplot()+geom_spatraster_rgb(data = landCover)+
#coord_sf(crs=st_crs(7844))+ #7844 is GDA2020
geom_point(data = koalas,
aes(x=decimalLongitude,y=decimalLatitude),
col='red')
## SpatRaster resampled to ncells = 501264
Because of the CRS this is completely wacky – fortunately the coord_sf()
thing lets us fix this easily.
ggplot()+geom_spatraster_rgb(data = landCover)+
coord_sf(crs=st_crs(7844))+ #7844 is GDA2020
geom_point(data = koalas,
aes(x=decimalLongitude,y=decimalLatitude),
col='red')
## SpatRaster resampled to ncells = 501264
So, try to be mindful of what is happening when you import map objects, and if something weird happens it is probably the CRS.
Lets try to use these data to fit a model for the number of koalas in an SA1. To do this we need to count the koalas in each SA1, and also take the average of the landcover in each region.
#we need to convert the koala data to a spatial
#object, then we can use st_intersects()
koalas_st <- st_as_sf(koalas,coords = c('decimalLongitude',
'decimalLatitude'),
crs = st_crs(abs_poly))
abs_poly$koalas <- lengths(st_intersects(abs_poly,koalas_st))
#for the landcover, we need to project it so
#it is in the same CRS as the polygons, then
#use extract to take the mean
landCoverProj <- project(landCover, "EPSG:7844")
##
|---------|---------|---------|---------|
=========================================
#this will take a while
ex1 <- extract(landCoverProj, abs_poly,mean)
abs_poly$Red <- ex1$Red
abs_poly$Green <- ex1$Green
abs_poly$Blue <- ex1$Blue
armidale <- abs_poly %>%
filter(SA2_NAME21 == 'Armidale') %>%
mutate(SA1_CODE_2021 = as.numeric(SA1_CODE21)) %>%
left_join(abs_data)
## Joining with `by = join_by(SA1_CODE_2021)`
An extremely naive model would be a Poisson GLM with these linear predictors, and an offset by the log of the area of the regions to take into account the fact that they are of different size. DHARMa lets us consider our modelling assumptions.
glm1 <- glm(koalas ~ Red + Green+ Blue + PhDs+ pop +
offset(log(AREASQKM21)), data = armidale,
family = 'poisson')
plot(simulateResiduals(glm1))
This is clearly not a particularly good model (it would take some thought here to decide how to deal with the populations given the areas of the regions etc.), but it is a useful example having done our map computations.